Code to Generate a Full List of Charlson ICD-10 Codes

The full tabular list of ICD-10 codes is available online. In particular, the 2018 ICD-10 Code Descriptions download contains the valid ICD-10-CM codes and their full code titles. The zip file includes a file named icd10cm_codes_2018.txt, a space-delimited text file with one code and its description per line.


In [298]:
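icd10file = "icd10cm_codes_2018.txt"

# Map each ICD-10-CM code to its description
codes = {}
with open(icd10file) as fp:
    for line in fp:
        if not line.strip():
            continue  # skip blank lines, which cannot be unpacked below
        # Split on the first space only; the description itself contains spaces
        code, desc = line.split(' ', 1)
        codes[code] = desc.strip()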
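
To make the parsing step concrete, the sketch below applies the same split-and-strip logic to a single line. The helper name `parse_line` is hypothetical, and the sample line is an assumption about the file layout (a code, padding spaces, then the description) rather than a line copied from the file.

```python
def parse_line(line):
    """Split one line into (code, description), as the loading loop does."""
    # Split on the first space only, so spaces inside the description survive.
    code, desc = line.split(" ", 1)
    # strip() removes any leftover padding and the trailing newline.
    return code, desc.strip()


# Assumed sample line, for illustration only.
sample = "A000    Cholera due to Vibrio cholerae 01, biovar cholerae\n"
print(parse_line(sample))
```

Because only the first space is used as the delimiter, any extra padding between the code and the description ends up at the start of the description and is removed by `strip()`.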

The ICD-10 codes for the Charlson comorbidities are taken from the following paper:

Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130‐1139. doi:10.1097/01.mlr.0000182534.19832.83.

Notes during development

  • The paper lists E10.0 (E100) under "Diabetes w/o chronic complications", but that code does not exist in the 2018 ICD-10-CM file
  • The paper lists B20.x-B22.x and B24.x for AIDS/HIV, but only B20 exists in the 2018 ICD-10-CM file
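
Discrepancies like these can be found mechanically by checking each code prefix from the paper against the loaded file. The following is a minimal sketch, assuming a dictionary shaped like `codes` from the first cell; the helper name `prefixes_without_codes` and the toy dictionary are illustrative only.

```python
def prefixes_without_codes(prefixes, codes):
    """Return the prefixes that no key in `codes` starts with."""
    return [p for p in prefixes
            if not any(code.startswith(p) for code in codes)]


# Toy stand-in for the `codes` dictionary; the real one is loaded from
# icd10cm_codes_2018.txt.
toy_codes = {
    "B20": "Human immunodeficiency virus [HIV] disease",
    "E109": "Type 1 diabetes mellitus without complications",
}

print(prefixes_without_codes(["B20", "B21", "B22", "B24", "E100"], toy_codes))
# → ['B21', 'B22', 'B24', 'E100']
```

Running the same check with the full `codes` dictionary and every prefix from the paper would list all patterns that match nothing in the file.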

In [303]:
import re

# Regular expressions for each of the Charlson comorbidities.
# Most of these expressions are easy to map to the original text, except for
# the "Any malignancy..." comorbidity, which requires multiple expressions to
# handle double-digit ranges (e.g. 00 - 26)
charlson = (("Myocardial infarction", 
             "I21|I22|I252"),
            ("Congestive heart failure", 
             "I099|I110|I130|I132|I255|I420|I42[5-9]|I43|I50|P290"),
            ("Peripheral vascular disease",
             "I70|I71|I731|I738|I739|I771|I790|I792|K551|K558|K559|Z958|Z959"),
            ("Cerebrovascular disease",
             "G45|G46|H340|I6[0-9]"),
            ("Dementia",
             "F0[0-3]|F051|G30|G311"),
            ("Chronic pulmonary disease",
             "I278|I279|J4[0-7]|J6[0-7]|J684|J701|J703"),
            ("Rheumatic disease",
             "M05|M06|M315|M3[2-4]|M351|M353|M360"),
            ("Peptic ulcer disease",
             "K2[5-8]"),
            ("Mild liver disease",
             "B18|K70[0-3]|K709|K71[3-5]|K717|K73|K74|K760|K76[2-4]|K768|K769|Z944"),
            ("Diabetes without chronic complication",
             "E100|E101|E106|E108|E109|E110|E111|E116|E118|E119|E120|E121|E126|E128|E129|E130|E131|E136|E138|E139|E140|E141|E146|E148|E149"),
            ("Diabetes with chronic complication",
             "E10[2-5]|E107|E11[2-5]|E117|E12[2-5]|E127|E13[2-5]|E137|E14[2-5]|E147"),
            ("Hemiplegia or paraplegia",
             "G041|G114|G801|G802|G81|G82|G83[0-4]|G839"),
            ("Renal disease",
             "I120|I131|N03[2-7]|N05[2-7]|N18|N19|N250|Z49[0-2]|Z940|Z992"),
            # C00.x–C26.x, C30.x–C34.x, C37.x– C41.x, C43.x, C45.x–C58.x, C60.x– C76.x, C81.x–C85.x, C88.x, C90.x–C97.x
            ("Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin",
             "C0[0-9]|C1[0-9]|C2[0-6]|C3[0-4]|C3[7-9]|C4[0-1]|C43|C4[5-9]|C5[0-8]|C6[0-9]|C7[0-6]|C8[1-5]|C88|C9[0-7]"),
            ("Moderate or severe liver disease",
             "I850|I859|I864|I982|K704|K711|K721|K729|K765|K766|K767"),
            ("Metastatic solid tumor",
             "C7[7-9]|C80"),
            ("AIDS/HIV",
             "B2[0-2]|B24"))

columns = [column for column, regex in charlson]
regexs = [re.compile(regex) for column, regex in charlson]

# Output file has a column for each Charlson comorbidity along with a code and description
header = columns + ["Code", "Description"]

import csv

output_file = "charlson_matrix.csv"
with open(output_file, 'w', newline='') as fp:
    # Quote every non-numeric field so commas or quotes in descriptions stay intact
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(header)

    for code, desc in codes.items():
        # One 0/1 flag per comorbidity; match() anchors each pattern at the start of the code
        answer = [1 if regex.match(code) else 0 for regex in regexs]
        if any(answer):
            # We matched one of the Charlson comorbidities, so save this code to our output file
            writer.writerow(answer + [code, desc])
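
The two-digit ranges in the malignancy pattern were written by hand as character classes. As an alternative, a range such as C00-C26 could be expanded programmatically. The helper below, `prefix_range_regex`, is a hypothetical sketch and not part of the code above; it produces a longer explicit alternation that is intended to match the same prefixes as `C0[0-9]|C1[0-9]|C2[0-6]`.

```python
import re


def prefix_range_regex(letter, start, end):
    """Build an alternation of letter + two-digit number for start..end inclusive."""
    return "|".join(f"{letter}{n:02d}" for n in range(start, end + 1))


# Expand C00-C26 into "C00|C01|...|C26" and compile it.
pattern = re.compile(prefix_range_regex("C", 0, 26))

# match() anchors at the start, so each alternative acts as a code prefix.
print(bool(pattern.match("C260")), bool(pattern.match("C270")))
```

The explicit alternation is more verbose than character classes, but each range from the paper maps to a single call, which makes transcription errors easier to spot.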